Spring
TinyML for Speech Recognition
We train and deploy a quantized 1D convolutional neural network model to conduct speech recognition on a highly resource-constrained IoT edge device. This can be useful in various Internet of Things (IoT) applications, such as smart homes and ambient assisted living for the elderly and people with disabilities, just to name a few examples. In this paper, we first create a new dataset with over one hour of audio data that enables our research and will be useful to future studies in this field. Second, we utilize the technologies provided by Edge Impulse to enhance our model's performance and achieve a high accuracy of up to 97% on our dataset. For the validation, we implement our prototype using the Arduino Nano 33 BLE Sense microcontroller board. This microcontroller board is specifically designed for IoT and AI applications, making it an ideal choice for our target use case scenarios. While most existing research focuses on a limited set of keywords, our model can process 23 different keywords, enabling complex commands. Natural Language Processing (NLP) and Speech Recognition are crucial domains in Artificial Intelligence (AI). While NLP deals with enabling computers to analyze, understand, reason on, and generate human language in textual form, speech recognition is concerned with that in spoken form.
- North America > United States > Texas > Harris County > Spring (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Health & Medicine (1.00)
- Information Technology > Smart Houses & Appliances (0.54)
Automated Duplicate Bug Report Detection in Large Open Bug Repositories
Laney, Clare E., Barovic, Andrew, Moin, Armin
Many users and contributors of large open-source projects report software defects or enhancement requests (known as bug reports) to the issue-tracking systems. However, they sometimes report issues that have already been reported. First, they may not have time to do sufficient research on existing bug reports. Second, they may not possess the right expertise in that specific area to realize that an existing bug report is essentially elaborating on the same matter, perhaps with a different wording. In this paper, we propose a novel approach based on machine learning methods that can automatically detect duplicate bug reports in an open bug repository based on the textual data in the reports. We present six alternative methods: topic modeling, Gaussian Naive Bayes, deep learning, time-based organization, clustering, and summarization using a generative pre-trained transformer large language model. Additionally, we introduce a novel threshold-based approach for duplicate identification, in contrast to the conventional top-k selection method that has been widely used in the literature. Our approach demonstrates promising results across all the proposed methods, achieving accuracy rates ranging from the high 70s to the low 90s (in percent). We evaluated our methods on a public dataset of issues belonging to an Eclipse open-source project.
- North America > United States > Texas > Harris County > Spring (0.04)
- North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
- North America > United States > Virginia > Fairfax County > Fairfax (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (2 more...)
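The contrast between top-k selection and threshold-based duplicate identification mentioned in the abstract above can be illustrated with a minimal sketch. The report IDs, the similarity scores, and the 0.8 cutoff are hypothetical values chosen for illustration; the abstract does not state how the authors compute similarity or which threshold they use.

```python
from typing import Dict, List


def top_k_candidates(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k most similar reports, however weak the match is."""
    ranked = sorted(scores, key=lambda rid: (-scores[rid], rid))
    return ranked[:k]


def threshold_candidates(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return every report whose similarity reaches the threshold (possibly none)."""
    ranked = sorted(scores, key=lambda rid: (-scores[rid], rid))
    return [rid for rid in ranked if scores[rid] >= threshold]


# Hypothetical similarity of one new report to four existing reports.
similarity = {"BUG-101": 0.91, "BUG-087": 0.42, "BUG-230": 0.78, "BUG-012": 0.15}

top2 = top_k_candidates(similarity, 2)
above = threshold_candidates(similarity, 0.8)
print(top2)   # always exactly k results
print(above)  # only results that clear the cutoff
```

Top-k always returns k candidates, even when none is a plausible duplicate, whereas a threshold can return an empty list for a genuinely new report.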
Model-Driven Quantum Code Generation Using Large Language Models and Retrieval-Augmented Generation
This paper introduces a novel research direction for model-to-text/code transformations by leveraging Large Language Models (LLMs) that can be enhanced with Retrieval-Augmented Generation (RAG) pipelines. The focus is on quantum and hybrid quantum-classical software systems, where model-driven approaches can help reduce the costs and mitigate the risks associated with the heterogeneous platform landscape and lack of developers' skills. We validate one of the proposed ideas regarding generating code out of UML model instances of software systems. This Python code uses a well-established library, called Qiskit, to execute on gate-based or circuit-based quantum computers. The RAG pipeline that we deploy incorporates sample Qiskit code from public GitHub repositories. Experimental results show that well-engineered prompts can improve CodeBLEU scores by up to a factor of four, yielding more accurate and consistent quantum code. However, the proposed research direction can go beyond this through further investigation in the future by conducting experiments to address our other research questions and ideas proposed here, such as deploying software system model instances as the source of information in the RAG pipelines, or deploying LLMs for code-to-code transformations, for instance, for transpilation use cases.
- North America > United States > Colorado > El Paso County > Colorado Springs (0.05)
- North America > United States > Texas > Harris County > Spring (0.04)
Engineering fantasy into reality
"One of the dreams I had as a kid was about the first day of school, and being able to build and be creative, and it was the happiest day of my life. And at MIT, I felt like that dream became reality," says Ballesteros. Growing up in the suburban town of Spring, Texas, just outside of Houston, Erik Ballesteros couldn't help but be drawn in by the possibilities for humans in space. It was the early 2000s, and NASA's space shuttle program was the main transport for astronauts to the International Space Station (ISS). Ballesteros' hometown was less than an hour from Johnson Space Center (JSC), where NASA's mission control center and astronaut training facility are based.
- North America > United States > Texas > Harris County > Spring (0.25)
- North America > United States > Texas > Travis County > Austin (0.05)
- North America > United States > Florida > Brevard County (0.05)
- North America > United States > California > Los Angeles County > Pasadena (0.05)
- Government > Space Agency (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Automated Machine Learning: A Case Study on Non-Intrusive Appliance Load Monitoring
Moin, Armin, Wattanavaekin, Ukrit, Lungu, Alexandra, Rössler, Stephan, Günnemann, Stephan
We propose a novel approach to enable Automated Machine Learning (AutoML) for Non-Intrusive Appliance Load Monitoring (NIALM), also known as Energy Disaggregation, through Bayesian Optimization. NIALM offers a cost-effective alternative to smart meters for measuring the energy consumption of electric devices and appliances. NIALM methods analyze the entire power consumption signal of a household and predict the type of appliances as well as their individual power consumption (i.e., their contributions to the aggregated signal). We enable NIALM domain experts and practitioners who typically have no deep data analytics or Machine Learning (ML) skills to benefit from state-of-the-art ML approaches to NIALM. Further, we conduct a survey and benchmarking of the state of the art and show that in many cases, simple and basic ML models and algorithms, such as Decision Trees, outperform the state of the art. Finally, we present our open-source tool, AutoML4NIALM, which will facilitate the exploitation of existing methods for NIALM in the industry.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- North America > United States > Texas > Harris County > Spring (0.04)
- North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
- (3 more...)
- Research Report > Promising Solution (0.49)
- Overview > Innovation (0.35)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)
Mining Software Repositories for Expert Recommendation
Marshall, Chad, Barovic, Andrew, Moin, Armin
We propose an automated approach to bug assignment to developers in large open-source software projects. This way, we assist human bug triagers who are in charge of finding the best developer with the right level of expertise in a particular area to be assigned to a newly reported issue. Our approach is based on the history of software development as documented in the issue tracking systems. Our approach works based on the bug reports' features, such as the corresponding products and components, as well as their priority and severity levels. We sort developers based on their experience with specific combinations of new reports. The evaluation is performed using Top-k accuracy, and the results are compared with the reported results in prior work, namely TopicMiner-MTM, BUGZIE, Bug Triaging via Deep Reinforcement Learning (BT-RL), and LDA-SVM. The evaluation data come from various Eclipse and Mozilla projects, such as JDT, Firefox, and Thunderbird. Large open-source projects offer an issue tracking system or open bug repository, where developers and users can report the software defects they find or any new feature requests they may have. These reports are called bug reports or issues. In some cases, developers can volunteer to work on the reported issues they find interesting or relevant to their field of expertise. Additionally, they sometimes report issues and assign them to themselves. However, in many cases, particularly in large open-source projects, a group of developers, called bug triagers, decide who should process and fix a newly reported issue.
- North America > United States > Texas > Harris County > Spring (0.04)
- North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
- (3 more...)
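Top-k accuracy, the evaluation metric named in the abstract above, counts a recommendation as correct when the developer who actually handled the bug appears among the first k ranked candidates. The following is a minimal sketch under that standard definition; the developer names and rankings are made up for illustration and do not come from the paper.

```python
from typing import List, Sequence


def top_k_accuracy(ranked_lists: Sequence[List[str]],
                   true_labels: Sequence[str],
                   k: int) -> float:
    """Fraction of bugs whose true assignee is within the first k recommendations."""
    if len(ranked_lists) != len(true_labels):
        raise ValueError("ranked_lists and true_labels must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(1 for ranked, truth in zip(ranked_lists, true_labels)
               if truth in ranked[:k])
    return hits / len(true_labels)


# Hypothetical ranked recommendations for three bug reports.
recommendations = [
    ["alice", "bob", "carol"],
    ["bob", "dave", "alice"],
    ["carol", "alice", "bob"],
]
# Hypothetical developers who actually fixed each bug.
actual = ["bob", "alice", "dave"]

for k in (1, 2, 3):
    print(f"Top-{k} accuracy: {top_k_accuracy(recommendations, actual, k):.2f}")
```

The metric is non-decreasing in k, which is why results are usually reported for several values of k.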
Automated Bug Report Prioritization in Large Open-Source Projects
Large open-source projects receive a large number of issues (known as bugs), including software defect (i.e., bug) reports and new feature requests from their user and developer communities at a fast rate. The often limited project resources do not allow them to deal with all issues. Instead, they have to prioritize them according to the project's priorities and the issues' severities. In this paper, we propose a novel approach to automated bug prioritization based on the natural language text of the bug reports that are stored in the open bug repositories of the issue-tracking systems. We conduct topic modeling using a variant of LDA called TopicMiner-MTM and text classification with the BERT large language model to achieve a higher performance level compared to the state-of-the-art. Experimental results using an existing reference dataset containing 85,156 bug reports of the Eclipse Platform project indicate that we outperform existing approaches in terms of Accuracy, Precision, Recall, and F1-measure of the bug report priority prediction. Index Terms: automated bug prioritization, automated bug triage, mining software repositories, machine learning, natural language processing. I. INTRODUCTION. Large open-source projects offer an issue-tracking system with an open bug repository, where developers and users can report the software defects they find or any new feature requests they may have. These reports are called bug reports. However, the projects' resources are limited, while processing and resolving the bug reports is typically very costly. Hence, not all bug reports in the open bug repository can be processed and handled at once.
- North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
- North America > United States > Texas > Harris County > Spring (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > India > NCT > New Delhi (0.04)
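The four metrics named in the abstract above (Accuracy, Precision, Recall, and F1-measure) can be sketched for a single priority class as follows. The labels and predictions are invented for illustration, and treating one class as "positive" is an assumption; the abstract does not say how the authors aggregate metrics across priority levels.

```python
from typing import Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Share of predictions that match the true priority label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true: Sequence[str],
                        y_pred: Sequence[str],
                        positive: str) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical true and predicted priority labels for six bug reports.
y_true = ["P1", "P3", "P3", "P2", "P1", "P3"]
y_pred = ["P1", "P3", "P2", "P2", "P3", "P3"]

acc = accuracy(y_true, y_pred)
prec, rec, f1 = precision_recall_f1(y_true, y_pred, positive="P3")
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Reporting precision and recall alongside accuracy matters when priority classes are imbalanced, since accuracy alone can be dominated by the most frequent class.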
Bug Destiny Prediction in Large Open-Source Software Repositories through Sentiment Analysis and BERT Topic Modeling
Pope, Sophie C., Barovic, Andrew, Moin, Armin
This study explores a novel approach to predicting key bug-related outcomes, including the time to resolution, time to fix, and ultimate status of a bug, using data from the Bugzilla Eclipse Project. Specifically, we leverage features available before a bug is resolved to enhance predictive accuracy. Our methodology incorporates sentiment analysis to derive both an emotionality score and a sentiment classification (positive or negative). Additionally, we integrate the bug's priority level and its topic, extracted using a BERTopic model, as features for a Convolutional Neural Network (CNN) and a Multilayer Perceptron (MLP). Our findings indicate that the combination of BERTopic and sentiment analysis can improve certain model performance metrics. Furthermore, we observe that balancing model inputs enhances practical applicability, albeit at the cost of a significant reduction in accuracy in most cases. To address our primary objectives, predicting time-to-resolution, time-to-fix, and bug destiny, we employ both binary classification and exact time value predictions, allowing for a comparative evaluation of their predictive effectiveness. Results demonstrate that sentiment analysis serves as a valuable predictor of a bug's eventual outcome, particularly in determining whether it will be fixed. However, its utility is less pronounced when classifying bugs into more complex or unconventional outcome categories.
- North America > United States > Wisconsin > La Crosse County > La Crosse (0.14)
- North America > United States > Texas > Harris County > Spring (0.04)
- North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)
WavePulse: Real-time Content Analytics of Radio Livestreams
Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay
Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > New York > Kings County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (215 more...)
- Media > Radio (1.00)
- Leisure & Entertainment (1.00)
- Government > Voting & Elections (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation
Brundage, Miles, Avin, Shahar, Clark, Jack, Toner, Helen, Eckersley, Peter, Garfinkel, Ben, Dafoe, Allan, Scharre, Paul, Zeitzoff, Thomas, Filar, Bobby, Anderson, Hyrum, Roff, Heather, Allen, Gregory C., Steinhardt, Jacob, Flynn, Carrick, hÉigeartaigh, Seán Ó, Beard, SJ, Belfield, Haydn, Farquhar, Sebastian, Lyle, Clare, Crootof, Rebecca, Evans, Owain, Page, Michael, Bryson, Joanna, Yampolskiy, Roman, Amodei, Dario
This report surveys the landscape of potential security threats from malicious uses of AI, and proposes ways to better forecast, prevent, and mitigate these threats. After analyzing the ways in which AI may influence the threat landscape in the digital, physical, and political domains, we make four high-level recommendations for AI researchers and other stakeholders. We also suggest several promising areas for further research that could expand the portfolio of defenses, or make attacks less effective or harder to execute. Finally, we discuss, but do not conclusively resolve, the long-term equilibrium of attackers and defenders.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- Asia > China (0.14)
- Asia > Russia (0.14)
- (25 more...)
- Overview (1.00)
- Research Report > New Finding (0.67)
- Instructional Material > Course Syllabus & Notes (0.45)
- Transportation > Air (1.00)
- Media > News (1.00)
- Leisure & Entertainment > Games (1.00)
- (13 more...)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- (6 more...)